Update the cache row dim calculation in TBE SSD #4480
This pull request was exported from Phabricator. Differential Revision: D77321062
Summary:
X-link: facebookresearch/FBGEMM#1537
- The current cache row dim calculation in TBE SSD assumes that optimizer state sizes are fixed relative to table dimensions. This change updates the calculation to account for optimizers whose state sizes depend on the row length, such as Partial Rowwise Adam.
Reviewed By: emlin, jiawenliu64
Differential Revision: D77321062
8e9bcfb to 656d0ba
Summary:
Pull Request resolved: pytorch#4480
X-link: facebookresearch/FBGEMM#1537
- The current cache row dim calculation in TBE SSD assumes that optimizer state sizes are fixed relative to table dimensions. This change updates the calculation to account for optimizers whose state sizes depend on the row length, such as Partial Rowwise Adam.
Reviewed By: emlin, jiawenliu64
Differential Revision: D77321062
be04ecf to 04209be
This pull request has been merged in 619b6ab.
Summary:
- The current cache row dim calculation in TBE SSD assumes that optimizer state sizes are fixed relative to table dimensions. This change updates the calculation to account for optimizers whose state sizes depend on the row length, such as Partial Rowwise Adam.
Reviewed By: emlin, jiawenliu64
Differential Revision: D77321062
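
For readers unfamiliar with the issue, the difference between a fixed-size and a row-length-dependent optimizer state can be sketched as below. This is an illustration only, not the code changed in this PR: the `Optimizer` enum, `optimizer_state_bytes`, and `cache_row_dim` are hypothetical names, and the sketch assumes that optimizer state is stored as float32, that rowwise Adagrad keeps one value per row, that Partial Rowwise Adam keeps a full-length first moment plus one rowwise second moment, and that no alignment padding is applied.

```python
from enum import Enum


class Optimizer(Enum):
    # Hypothetical enum for this sketch; not the FBGEMM optimizer type.
    ROWWISE_ADAGRAD = "rowwise_adagrad"
    PARTIAL_ROWWISE_ADAM = "partial_rowwise_adam"


STATE_BYTES = 4  # assumption: optimizer state is kept in float32


def optimizer_state_bytes(optimizer: Optimizer, row_len: int) -> int:
    """Bytes of optimizer state stored alongside one embedding row."""
    if optimizer is Optimizer.ROWWISE_ADAGRAD:
        # One accumulator per row: size does not depend on row_len.
        return STATE_BYTES
    if optimizer is Optimizer.PARTIAL_ROWWISE_ADAM:
        # Assumed layout: first moment per element, second moment per row,
        # so the size grows with row_len.
        return row_len * STATE_BYTES + STATE_BYTES
    raise ValueError(f"unsupported optimizer: {optimizer}")


def cache_row_dim(optimizer: Optimizer, row_len: int, weight_bytes: int) -> int:
    """Row width, in weight-dtype elements, needed to hold weights plus state."""
    state_bytes = optimizer_state_bytes(optimizer, row_len)
    # Round the state up to a whole number of weight-dtype elements.
    state_elems = (state_bytes + weight_bytes - 1) // weight_bytes
    return row_len + state_elems


# fp16 weights (2 bytes per element), 128-element rows
adagrad_dim = cache_row_dim(Optimizer.ROWWISE_ADAGRAD, 128, weight_bytes=2)
adam_dim = cache_row_dim(Optimizer.PARTIAL_ROWWISE_ADAM, 128, weight_bytes=2)
```

Under these assumptions a calculation that adds a constant number of elements for optimizer state would be correct for the Adagrad case but would undersize the cache row for the Adam case, which is the kind of gap the summary describes.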